40 research outputs found

    A hidden Markov model-based algorithm for identifying tumour subtype using array CGH data

    BACKGROUND: Recent advances in array CGH (aCGH) research have significantly improved tumor identification using DNA copy number data. A number of unsupervised learning methods have been proposed for clustering aCGH samples. Two major challenges in developing aCGH sample clustering methods are the high spatial correlation between aCGH markers and low computing efficiency. A mixture hidden Markov model-based algorithm was developed to address these two challenges. RESULTS: The hidden Markov model (HMM) was used to model the spatial correlation between aCGH markers. A fast clustering algorithm was implemented, and real data analysis on glioma aCGH data has shown that it converges to the optimal cluster rapidly and that the computation time is proportional to the sample size. Simulation results showed that this HMM-based clustering (HMMC) method has a substantially lower error rate than NMF clustering. The HMMC results for glioma data were significantly associated with clinical outcomes. CONCLUSIONS: We have developed a fast clustering algorithm to identify tumor subtypes based on DNA copy number aberrations. The performance of the proposed HMMC method has been evaluated using both simulated and real aCGH data. The software for HMMC in both R and C++ is available on the ND INBRE website: http://ndinbre.org/programs/bioinformatics.php

    Screening for New Biomarkers for Subcortical Vascular Dementia and Alzheimer's Disease

    BACKGROUND: Novel biomarkers are important for identifying as well as differentiating subcortical vascular dementia (SVD) and Alzheimer's disease (AD) at an early stage in the disease process. METHODS: In two independent cohorts, a multiplex immunoassay was utilized to analyze 90 proteins in cerebrospinal fluid (CSF) samples from dementia patients and patients at risk of developing dementia (mild cognitive impairment). RESULTS: The levels of several CSF proteins were increased in SVD and its incipient state, and in moderate-to-severe AD compared with the control group. In contrast, some CSF proteins were altered in AD, but not in SVD. The levels of heart-type fatty acid binding protein (H-FABP) were consistently increased in all groups with dementia but only in some of their incipient states. CONCLUSIONS: In summary, these results support the notion that SVD and AD are driven by different pathophysiological mechanisms reflected in the CSF protein profile and that H-FABP in CSF is a general marker of neurodegeneration

    Improving the Robustness of Variable Selection and Predictive Performance of Regularized Generalized Linear Models and Cox Proportional Hazard Models

    High-dimensional data applications often entail the use of various statistical and machine-learning algorithms to identify an optimal signature based on biomarkers and other patient characteristics that predicts the desired clinical outcome in biomedical research. Both the composition and predictive performance of such biomarker signatures are critical in various biomedical research applications. In the presence of a large number of features, however, a conventional regression analysis approach fails to yield a good prediction model. A widely used remedy is to introduce regularization in fitting the relevant regression model. In particular, an L1 penalty on the regression coefficients is extremely useful, and very efficient numerical algorithms have been developed for fitting such models with different types of responses. This L1-based regularization tends to generate a parsimonious prediction model with promising prediction performance, i.e., feature selection is achieved along with construction of the prediction model. The variable selection, and hence the composition of the signature, as well as the prediction performance of the model depend on the choice of the penalty parameter used in the L1 regularization. The penalty parameter is often chosen by K-fold cross-validation. However, such an algorithm tends to be unstable and may yield very different choices of the penalty parameter across multiple runs on the same dataset. In addition, the predictive performance estimates from the internal cross-validation procedure in this algorithm tend to be inflated. In this paper, we propose a Monte Carlo approach to improve the robustness of regularization parameter selection, along with an additional cross-validation wrapper for objectively evaluating the predictive performance of the final model. We demonstrate the improvements via simulations and illustrate the application via a real dataset.
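
The instability described above comes from the random fold assignment: one K-fold run picks one penalty, another run may pick a different one. A minimal sketch of a Monte Carlo remedy follows, assuming a squared-error lasso without intercept, a small fixed grid of penalties, and the median of the per-run choices as the aggregation rule. The abstract does not state the aggregation rule or the fitting algorithm, so `fit_lasso`, `one_cv_choice`, and `monte_carlo_lambda` are hypothetical names and the median rule is an assumption, not the authors' method.

```python
import random
import statistics


def soft_threshold(z, lam):
    """Soft-thresholding operator used by the lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def fit_lasso(X, y, lam, n_iter=50):
    """Coordinate descent for (1/2n)*||y - X b||^2 + lam*||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals for the current beta (all zeros at start)
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (column j added back).
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def cv_error(X, y, lam, folds):
    """Mean squared prediction error of the lasso over the held-out folds."""
    total = 0.0
    for test_idx in folds:
        held_out = set(test_idx)
        train = [i for i in range(len(y)) if i not in held_out]
        beta = fit_lasso([X[i] for i in train], [y[i] for i in train], lam)
        for i in test_idx:
            pred = sum(b * x for b, x in zip(beta, X[i]))
            total += (y[i] - pred) ** 2
    return total / len(y)


def one_cv_choice(X, y, lambdas, k, rng):
    """One ordinary K-fold run: random folds, return the error-minimizing penalty."""
    idx = list(range(len(y)))
    rng.shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    errors = [cv_error(X, y, lam, folds) for lam in lambdas]
    return lambdas[errors.index(min(errors))]


def monte_carlo_lambda(X, y, lambdas, k=5, n_repeats=20, seed=0):
    """Repeat K-fold CV with fresh random folds; aggregate by the (low) median.

    The median rule is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    choices = [one_cv_choice(X, y, lambdas, k, rng) for _ in range(n_repeats)]
    return statistics.median_low(choices), choices
```

Calling `monte_carlo_lambda(X, y, lambdas)` returns the aggregated penalty together with the per-run choices, so the spread of `choices` shows how unstable a single cross-validation run is on a given dataset. An honest performance estimate would still need an outer cross-validation loop around this whole selection step, as the abstract notes.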

    Empirical simulation extrapolation for measurement error models with replicate measurements

    We present a variation of the SIMEX algorithm (J. Amer. Statist. Assoc. 89 (1994) 1314) appropriate for the case in which the measurement error variance(s) are unknown but replicate measurements are available. The method uses pseudo errors generated from random linear contrasts of the observed replicate measurements. An attractive feature of the new method is its ability to accommodate heteroscedastic measurement error. Keywords: Errors-in-variables; Heteroscedasticity; Logistic regression; Method of moments; Simulation; Variance components
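
The idea of building pseudo errors from random contrasts of the replicates can be illustrated with a small sketch. It assumes a simple linear regression of y on an error-prone covariate, m replicates per subject, the inflation levels 0, 1 and 2, and an exact quadratic extrapolation to -1. The abstract gives none of these specifics, so the formula W_bar + sqrt(lambda/m) * (contrast applied to the replicates) and all function names here are assumptions for illustration only.

```python
import math
import random


def random_contrast(m, rng):
    """Random unit-norm vector of length m whose entries sum to zero."""
    z = [rng.gauss(0.0, 1.0) for _ in range(m)]
    zbar = sum(z) / m
    d = [v - zbar for v in z]
    norm = math.sqrt(sum(v * v for v in d))
    return [v / norm for v in d]


def slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def empirical_simex_slope(reps, y, n_sim=20, seed=0):
    """Return (naive slope, extrapolated slope) from replicate measurements.

    reps[i] holds the m replicate measurements for subject i. Because the
    contrast sums to zero, contrast . reps[i] contains only measurement error,
    with that subject's own error variance, so no variance estimate is needed.
    """
    rng = random.Random(seed)
    m = len(reps[0])
    wbar = [sum(r) / m for r in reps]
    naive = slope(wbar, y)
    means = [naive]  # level 0: no added pseudo error
    for lam in (1.0, 2.0):
        acc = 0.0
        for _ in range(n_sim):
            w_star = []
            for r, wb in zip(reps, wbar):
                c = random_contrast(m, rng)
                pseudo = sum(cj * wj for cj, wj in zip(c, r))
                w_star.append(wb + math.sqrt(lam / m) * pseudo)
            acc += slope(w_star, y)
        means.append(acc / n_sim)
    f0, f1, f2 = means
    # Quadratic through the levels 0, 1, 2 evaluated at -1.
    return naive, 3.0 * f0 - 3.0 * f1 + f2
```

The quadratic extrapolant usually removes only part of the attenuation, so the corrected slope should be read as an approximation rather than an exact recovery of the true coefficient.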

    Hearing Loss in Alzheimer’s Disease Is Associated with Altered Serum Lipidomic Biomarker Profiles

    Recent data have found that aging-related hearing loss (ARHL) is associated with the development of Alzheimer’s Disease (AD). However, the nature of the relationship between these two disorders is not clear. There are multiple potential factors that link ARHL and AD, and previous investigators have speculated that shared metabolic dysregulation may underlie the propensity to develop both disorders. Here, we investigate the distribution of serum lipidomic biomarkers in AD subjects with or without hearing loss in a publicly available dataset. Serum levels of 349 known lipids from 16 lipid classes were measured in 185 AD patients. Using previously defined co-regulated sets of lipids, both age- and sex-adjusted, we found that lipid sets enriched in phosphatidylcholine and phosphatidylethanolamine showed a strong inverse association with hearing loss. Examination of biochemical classes confirmed these relationships and revealed that serum phosphatidylcholine levels were significantly lower in AD subjects with hearing loss. A similar relationship was not found in normal subjects. These data suggest that a synergistic relationship may exist between AD, hearing loss and metabolic biomarkers, such that in the context of a pathological state such as AD, alterations in serum metabolic profiles are associated with hearing loss. These data also point to a potential role for phosphatidylcholine, a molecule with antioxidant properties, in the underlying pathophysiology of ARHL in the context of AD, which has implications for our understanding and potential treatment of both disorders

    Monte Carlo Estimation of g(µ) from Normally Distributed Data with Applications

    We derive Monte Carlo-amenable solutions to the problem of unbiased estimation of a nonlinear function of the mean of a normal distribution. For most nonlinear functions the maximum likelihood estimator is biased. Our method yields a Monte Carlo approximation to the uniformly minimum variance unbiased estimator for a wide class of nonlinear functions. Applications to problems arising in the analysis of data measured with error and the secondary analysis of estimated data are described.
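
One way such an estimator can be made concrete is sketched below, under assumptions the abstract does not state: the variance sigma^2 is known, g is an entire function that accepts complex arguments, and the estimator is the average of the real part of g(xbar + i*(sigma/sqrt(n))*Z) over independent standard normal draws Z. For g(t) = t^2 this average converges to xbar^2 - sigma^2/n, which is unbiased for mu^2, whereas the plug-in value xbar^2 is biased upward by sigma^2/n. The function name `mc_unbiased_estimate` is hypothetical.

```python
import cmath
import math
import random


def mc_unbiased_estimate(g, xbar, sigma, n, n_draws=20000, seed=0):
    """Monte Carlo average of Re g(xbar + i * (sigma / sqrt(n)) * Z), Z ~ N(0, 1).

    Assumes sigma is known and g is analytic on the complex plane; both are
    assumptions of this sketch.
    """
    rng = random.Random(seed)
    s = sigma / math.sqrt(n)  # standard error of the sample mean
    total = 0.0
    for _ in range(n_draws):
        z = rng.gauss(0.0, 1.0)
        total += g(complex(xbar, s * z)).real
    return total / n_draws


def square(t):
    """Example nonlinear function g(t) = t^2."""
    return t * t
```

For example, `mc_unbiased_estimate(cmath.exp, xbar, sigma, n)` approximates exp(xbar - sigma^2/(2n)), the bias-corrected estimate of exp(mu) in this known-variance setting.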

    Estimating a nonlinear function of a normal mean

    We derive a Monte-Carlo-amenable, minimum variance unbiased estimator of a nonlinear function of a normal mean and the variance of the estimator. Applications to problems arising in the analysis of data measured with error are described. Copyright 2005, Oxford University Press.

    A multivariate predictive modeling approach reveals a novel CSF peptide signature for both Alzheimer's Disease state classification and for predicting future disease progression

    To determine if a multi-analyte cerebrospinal fluid (CSF) peptide signature can be used to differentiate Alzheimer’s Disease (AD) and normal aged controls (NL), and to determine if this signature can also predict progression from mild cognitive impairment (MCI) to AD, analysis of CSF samples was done on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset. The profiles of 320 peptides from baseline CSF samples of 287 subjects over a 3–6 year period were analyzed. As expected, the peptide most able to differentiate between AD vs. NL was found to be Apolipoprotein E. Other peptides, some of which are not classically associated with AD, such as heart fatty acid binding protein, and the neuronal pentraxin receptor, also differentiated disease states. A sixteen-analyte signature was identified which differentiated AD vs. NL with an area under the receiver operating characteristic curve of 0.89, which was better than any combination of amyloid beta (1–42), tau, and phospho-181 tau. This same signature, when applied to a new and independent data set, also strongly predicted both probability and rate of future progression of MCI subjects to AD, better than traditional markers. These data suggest that multivariate peptide signatures from CSF predict MCI to AD progression, and point to potentially new roles for certain proteins not typically associated with AD.